## Loading required package: SummarizedExperiment
## Loading required package: MatrixGenerics
## Loading required package: matrixStats
##
## Attaching package: 'MatrixGenerics'
## The following objects are masked from 'package:matrixStats':
##
## colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
## colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
## colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
## colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
## colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
## colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
## colWeightedMeans, colWeightedMedians, colWeightedSds,
## colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
## rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
## rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
## rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
## rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
## rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
## rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
## rowWeightedSds, rowWeightedVars
## Loading required package: GenomicRanges
## Loading required package: stats4
## Loading required package: BiocGenerics
## Loading required package: parallel
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
##
## clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
## clusterExport, clusterMap, parApply, parCapply, parLapply,
## parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, append, as.data.frame, basename, cbind, colnames,
## dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
## grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
## order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
## rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
## union, unique, unsplit, which.max, which.min
## Loading required package: S4Vectors
##
## Attaching package: 'S4Vectors'
## The following objects are masked from 'package:base':
##
## expand.grid, I, unname
## Loading required package: IRanges
##
## Attaching package: 'IRanges'
## The following object is masked from 'package:destiny':
##
## distance
## Loading required package: GenomeInfoDb
## Loading required package: Biobase
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
##
## Attaching package: 'Biobase'
## The following object is masked from 'package:MatrixGenerics':
##
## rowMedians
## The following objects are masked from 'package:matrixStats':
##
## anyMissing, rowMedians
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following object is masked from 'package:S4Vectors':
##
## expand
## Loaded glmnet 4.1-4
## Loading required package: scuttle
## Loading required package: ggplot2
## Type 'citation("pROC")' for a citation.
##
## Attaching package: 'pROC'
## The following objects are masked from 'package:IRanges':
##
## cov, var
## The following objects are masked from 'package:S4Vectors':
##
## cov, var
## The following object is masked from 'package:BiocGenerics':
##
## var
## The following objects are masked from 'package:stats':
##
## cov, smooth, var
## aroma.light v3.22.0 (2021-05-19) successfully loaded. See ?aroma.light for help.
## Loading required package: viridisLite
##
## Attaching package: 'viridis'
## The following object is masked from 'package:scales':
##
## viridis_pal
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:Biobase':
##
## combine
## The following object is masked from 'package:BiocGenerics':
##
## combine
##
## Attaching package: 'data.table'
## The following object is masked from 'package:SummarizedExperiment':
##
## shift
## The following object is masked from 'package:GenomicRanges':
##
## shift
## The following object is masked from 'package:IRanges':
##
## shift
## The following objects are masked from 'package:S4Vectors':
##
## first, second
We use 50% of the cells for clustering, and then infer the hyperplanes separating the clusters using support vector machines, as in Zhang et al. (2019). Cell Systems. (https://doi.org/10.1016/j.cels.2019.07.012). We then use the hyperplanes to infer the cluster membership of the remaining 50% cells and only use those 50% for differential gene expression analysis across clusters. This avoid false positives for differential gene expression that we would obtain by using the same cells for both the clustering and the differential expression (double dipping).
We show the clusters on a UMAP, which we recomputed using only the Mesenchyme and Epicardium labelled cells (and not the entire reference atlas).
We now colour the UMAP by the average log-normalised expression of JCF markers in a cell divided by the average log-normalised expression (JCF score).
The clusters separate clearly in terms of their JCF signature.
We rename the clusters based on JCF scores.
We compute the numbers of epicardium cells in each cluster.
Because of the concordance, we rename mes_low_JCF_2 as Epicardium.
We plot the UMAP coloured by emryonic stage.
We compute the JCF score for tdTom+ versus tdTom- Mixl1 chimera cells that map to the Mesenchyme cell type. The plot below shows that there are the distribution of JCF scores for the tdTom- cells is bimodal - there are two clearly separate peaks. For the tdTom+ cells much fewer cells have higher JCF scores.
We compare this to the WT chimeras.
Using a Wilcoxon rank sum test and see that the effect on the knockout cells is significant for Mixl1 and not for the WT chimeras.
##
## Wilcoxon rank sum test with continuity correction
##
## data: chimeraWT$JCF_score[chimeraWT$tomato == TRUE & chimeraWT$celltype.mapped == "Mesenchyme"] and chimeraWT$JCF_score[chimeraWT$tomato == FALSE & chimeraWT$celltype.mapped == "Mesenchyme"]
## W = 99451, p-value = 0.08787
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -0.26669330 0.01837757
## sample estimates:
## difference in location
## -0.1227113
##
## Wilcoxon rank sum test with continuity correction
##
## data: chimeraMixl1$JCF_score[chimeraMixl1$tomato == TRUE & chimeraMixl1$celltype.mapped == "Mesenchyme"] and chimeraMixl1$JCF_score[chimeraMixl1$tomato == FALSE & chimeraMixl1$celltype.mapped == "Mesenchyme"]
## W = 84237, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -0.9371406 -0.7135792
## sample estimates:
## difference in location
## -0.8244189
The above already suggests a depletion of JCF as a result of Mixl1 knockout.
We now investigate how this difference in JCF score between the tdTom+ and tdTom- population translates into which cluster of Mesenchyme atlas cells the respective chimera cell is most similar to.
We repeat this for the WT chimeras.
We perform perturbSuite_DA.
First, we plot the trajectory without relabelling.
Now we replot the trajectory with the new labels.
Saving expression matrix and UMAP coordinates